Abstract

Within  the  field  of  power  generation,  aging  assets  and  a 
desire  for  improved  maintenance  decision-making  tools 
have  led  to  growing  interest  in  asset  prognostics.  Valve 
failures  can  account  for  7%  or  more  of  mechanical  failures, 
and  since  a  conventional  power  station  will  contain  many 
hundreds  of  valves,  this  represents  a  significant  asset  base. 
This  paper  presents  a  prognostic  approach  for  estimating  the 
remaining  useful  life  (RUL)  of  valves  experiencing 
degradation,  utilizing  a  similarity-based  method.  Case  study 
data  is  generated  through  simulation  of  valves  within  a 
400 MW Combined Cycle Gas Turbine power station. High
fidelity industrial simulators are often produced for operator
training, to allow personnel to experience fault procedures
and take corrective action in a safe simulation environment,
without endangering staff or equipment. This work
repurposes  such  a  high  fidelity  simulator  to  generate  the 
type  of  condition  monitoring  data  which  would  be  produced 
in  the  presence  of  a  fault.  A  first  principles  model  of  valve 
degradation  was  used  to  generate  multiple  run-to-failure 
events,  at  different  degradation  rates.  The  associated 
parameter  data  was  collected  to  generate  a  library  of  failure 
cases. This set of cases was partitioned into training and test
sets for prognostic modeling, and the similarity-based
prognostic technique was applied to calculate RUL. Results on
the technique’s accuracy are presented, and conclusions are
drawn about the applicability of the technique to this
domain.


Mark  McGhee  et  al.  This  is  an  open-access  article  distributed  under  the 
terms  of  the  Creative  Commons  Attribution  3.0  United  States  License, 
which  permits  unrestricted  use,  distribution,  and  reproduction  in  any 
medium,  provided  the  original  author  and  source  are  credited. 


1.  Introduction 

Within  electrical  power  utilities  there  is  an  increasing 
demand  for  condition  monitoring  methods  capable  of 
reliably  predicting  the  RUL  of  assets  (Sheppard  &  Kaufman 
2009). This requirement is driven by the need to reduce
maintenance costs and improve scheduling, as well as by safety
considerations (Chen, Yang & Zheng 2012). The field of
prognostics  has  made  great  advances  in  areas  with  high 
requirements  on  safety  and  dependability,  such  as  aerospace 
and the nuclear industry. However, within the power
generation  field,  prognostic  applications  have  not  been 
implemented  to  the  same  degree.  This  is  mainly  due  to  the 
challenges  of  gathering  sufficient  data  to  enable  robust 
testing  and  validation,  as  such  systems  are  rarely  allowed  to 
run  to  failure  (Heng,  Tan,  Mathew,  Montgomery,  Banjevic, 
&  Jardine,  2009). 

Within  power  generation,  implementation  of  prognostic 
methods  would  enable  operators  to  reduce  maintenance  and 
unplanned  downtime  by  utilizing  predictive  maintenance 
policies  in  place  of  a  time  based  maintenance  approach 
(Vachtsevanos,  Lewis,  Roemer,  Hess  &  Wu,  2006)  (Sun, 
Zeng,  Kang  &  Pecht  2012).  However,  there  is  a  high  cost 
associated  with  creating  physical  test  systems  from  which  to 
gather  run-to-failure  data.  Additionally,  gathering, 
understanding,  and  transforming  data  provided  by  on-site 
industrial  facilities  into  a  comprehensive  and  reliable  model 
is  a  costly  and  difficult  undertaking  (Wenbin  &  Carr  2010), 
with  operators  often  reluctant  to  provide  commercially 
sensitive  data. 


One  way  to  overcome  this  lack  of  failure  data  is  to  utilize 
simulation of assets to generate the data required. Following
this route, this paper proposes the simulation of degradation
of  valves  within  a  power  plant  environment  to  create  a 
similarity-based  prognostic  model.  Within  a  plant 
environment,  valves  have  been  highlighted  as  a  common 
source  of  faults,  accounting  for  at  least  7%  of  mechanical 
failures  (Radu,  Mladin  &  Prisecaru,  2013)  (Latcovich, 
Astrom,  Frankhuizen,  Fukushima,  Hamberg  &  Keller, 
2005),  and  with  many  hundreds  of  valves  present  in  a 
typical  generation  plant  (Westinghouse  Nuclear,  2013), 
valves  are  a  critical  asset  which  could  benefit  from  a 
prognostic  system. 

Within  power  generation,  simulators  have  been  widely 
deployed,  particularly  within  the  nuclear  sector,  for  training 
purposes  focused  on  improving  operational  safety  (Harrison, 
2013). Such simulators are certified as high fidelity tools,
so the model and sensor data are within industrially accepted
tolerances of actual plant values. Utilizing such high fidelity
simulators negates the need to create physical test beds, and
lends industrial acceptance and robustness to the simulated
data generated (McGhee, Catterson, McArthur and Harrison,
2013).

The  similarity-based  prognostic  method  used  here  is  based 
on an approach by Wang, Yu, Siegel and Lee (2008). This
similarity method is particularly well suited to the
simulation approach proposed here. With simulation, the
large number of run-to-failure cases needed for a
similarity-based approach can be generated easily. The use of
simulation  can  also  satisfy  the  requirements  stated  by  Wang 
et  al.  (2008)  for  a  successful  implementation: 

1)  Multiple  recordings  of  run-to-failure  data  are  available, 

2)  The  data  recorded  ends  when  the  point  of  failure  is 
reached,  and 

3)  The  data  covers  a  representative  set  of  components. 

2.  Methodology 

This  section  discusses  the  creation  of  the  valve  failure 
model  and  the  prognostic  RUL  model.  A  diagram  of  the 
process is shown in Figure 1.

2.1.  Valve  model  simulation 

The  valve  model  was  created  from  first  principles, 
simulating  fluid  flow  within  a  cylindrical  pipe: 


$$P_2 = P_1 + \tfrac{1}{2}\rho\,(V_1^2 - V_2^2) \tag{1}$$

$$A_1 V_1 = A_2 V_2 \tag{2}$$

where P1, V1 and A1 correspond to the pressure, fluid flow
and area of the pipe entering the valve, P2, V2 and A2
correspond to the pressure, fluid flow and area of the pipe at
the point of degradation, and ρ is the density of the
fluid. Parameter values for the model are taken from an
industrial Combined Cycle Gas Turbine (CCGT) plant
simulator.
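Substituting the continuity relation (2) into (1) gives P2 directly in terms of the area ratio, which makes the effect of a constriction explicit. This rearrangement is added here as a worked step and is not part of the original derivation:

```latex
V_2 = \frac{A_1 V_1}{A_2}
\quad\Longrightarrow\quad
P_2 = P_1 + \tfrac{1}{2}\rho V_1^2\left(1 - \frac{A_1^2}{A_2^2}\right)
```

With A2 = A1 the bracket vanishes and P2 = P1; as A2 shrinks below A1 the bracket becomes negative and P2 falls.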


Figure 1. Procedure of RUL estimation: (1) valve degradation
data generated; (2) generated data rearranged by health index;
(3) fitting function applied to rearranged data; (4) distance
evaluation, comparing test data with training data; (5) RUL
evaluated.


The degradation is represented by a decreasing area A2,
where the initial area of the pipe A1 is constricted over time.
This is represented by a degradation coefficient, δ, which is
a numerical constant between 0 and 0.0001, drawn from a
standard uniform distribution, describing the rate of
decrease in the flow area.

$$A_2(t+1) = A_1(0) - \delta A_1(t) \tag{3}$$


This  degradation  can  represent  debris  build  up  along  the 
area  of  flow,  or  “sticky  valve  failure”  where  the  valve  no 
longer  fully  closes  or  opens.  A  single  run-to-failure  event 
from  initial  healthy  operating  conditions  to  end  of  life  can 
be  seen  in  Figure  2,  and  a  batch  of  50  run-to-failure  events 
can  be  seen  in  Figure  3.  For  this  study,  the  end  of  life  is 
considered  to  be  P2  =  0,  i.e.  completely  blocked  flow. 
However,  in  a  power  station  deployment,  maintenance 
intervention  would  be  triggered  significantly  before  this 
threshold  is  reached. 
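As an illustration of how a single run-to-failure event might be generated, the sketch below substitutes Eq. (2) into Eq. (1) and steps the constricted area forward in time until P2 reaches zero. It assumes one possible reading of Eq. (3), in which A2 shrinks by a fixed fraction δ of the initial area at every time step. The function name `simulate_run` and the numerical values in the usage line are hypothetical and are not the plant simulator parameters.

```python
def simulate_run(p1, v1, a1, rho, delta, max_steps=100_000):
    """Return the P2 readings of one run-to-failure event.

    Assumption (not from the paper): A2(t) = A1 * (1 - delta * t).
    """
    pressures = []
    for t in range(max_steps):
        a2 = a1 * (1.0 - delta * t)      # assumed linear loss of flow area
        if a2 <= 0.0:
            break                        # pipe fully closed
        v2 = a1 * v1 / a2                # continuity, Eq. (2)
        p2 = p1 + 0.5 * rho * (v1 ** 2 - v2 ** 2)   # Bernoulli, Eq. (1)
        if p2 <= 0.0:
            break                        # end of life: P2 = 0
        pressures.append(p2)
    return pressures


# Hypothetical values, chosen only so that the run lasts a few hundred steps.
run = simulate_run(p1=18.0, v1=1.0, a1=1.0, rho=600.0, delta=5e-5)
```

For a small total loss of area the pressure falls almost linearly with time, which matches the near-linear fault progression reported for this case study.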

This  modeling  approach  corresponds  to  the  way  components 
and  faults  are  modeled  in  the  industrial  plant  simulator  used 
in  the  research.  The  plant  simulator  uses  first  principles 
equations  based  on  pressure,  fluid  flow  and  flow  area  to 
model  pipes  and  valves. 

The  modeling  choices  also  need  to  be  made  with  respect  to 
the  sensors  and  data  readily  available  to  station  operators. 
Theoretically,  measurement  points  could  be  placed  at  any 
point in the plant model, and the parameter value recorded
as if from instrumentation. However, for the prognostic
model to translate directly from the plant simulator to the
real plant environment, any measurements utilized by the
prognostic model must come from points where instrumentation
could realistically be located. Therefore, only those
parameters which would normally be recorded around a
valve are considered.


Figure 2. A single run-to-failure event

Figure 3. 50 run-to-failure events (horizontal axis: time)

For this study, the training data comprised 50 sets of
time-stamped pressure values, corresponding to P2 in Eq. (1),
from an initial value equal to P1 down to 0. The simulated
frequency of data capture is set at once per hour. For this
case, the parameters taken from the CCGT were an initial
pressure P1 = 18 Pa, area A1 = 10 cm² and flow V1 = 185 kg/s.

To  represent  measurement  noise,  each  data  point  had  a  noise 
term  added,  drawn  from  a  Gaussian  distribution  with  mean 
0  and  standard  deviation  0.0005. 
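The noise step can be sketched in a few lines. The function name `add_measurement_noise`, the `seed` argument and the three sample pressures are hypothetical; only the zero mean and the standard deviation of 0.0005 come from the text.

```python
import random


def add_measurement_noise(values, sigma=0.0005, seed=None):
    """Add zero-mean Gaussian noise to each reading."""
    rng = random.Random(seed)            # seeded so a run can be repeated
    return [v + rng.gauss(0.0, sigma) for v in values]


clean = [18.0, 17.5, 17.0]               # hypothetical noise-free readings
noisy = add_measurement_noise(clean, seed=0)
```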


2.2.  Prognostic  model 

The  procedure  for  creating  the  similarity-based  prognostic 
model is split into three steps (Wang et al., 2008). The first
two, arrangement by health index and polynomial fitting, are
data preparation steps applied to both training and test data.
The third step, distance evaluation, compares the test data set
against the training data. Of 55 run-to-failure events
simulated, 50 were used as training data and five as test data.

2.2.1.  Arrangement  by  health  index 

The  initial  stage  is  to  rearrange  the  data  to  create  a  Health 
Index  (HI).  The  HI  is  used  to  describe  the  condition  of  the 
asset.  Near  the  start  of  life  the  asset  is  assumed  to  be  in  a 
healthy  condition  and  assigned  the  value  1,  whilst  the 
unhealthy  or  near  end-of-life  condition  is  assigned  the  value 
0.  This  HI  is  then  applied  to  every  data  run  and  the  data 
rearranged  according  to  the  asset’s  time-to-failure  (Figure 
4).  As  shown  in  Figure  4,  the  start  of  life  (healthy)  and  end 
of life (unhealthy) values correspond to P = 18 and P = 0
respectively. 


Figure  4.  Training  set  comprising  50  run-to-failure  events 
rearranged  according  to  HI 
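A minimal sketch of this rearrangement is given below. It assumes that the HI is obtained by dividing each pressure reading by the healthy value of 18, and that the time axis is shifted so that the final (failed) sample sits at time 0. Both choices, and the name `arrange_by_health_index`, are illustrative assumptions rather than details taken from the paper.

```python
def arrange_by_health_index(pressures, p_healthy=18.0):
    """Return (adjusted time, health index) for one run-to-failure event."""
    n = len(pressures)
    # Adjusted time: 0 at the last sample, negative before failure.
    t_adj = [k - (n - 1) for k in range(n)]
    # Assumed HI: 1 at the healthy pressure, 0 at zero pressure.
    hi = [p / p_healthy for p in pressures]
    return t_adj, hi


# Hypothetical five-sample run from healthy (18) to failed (0).
t_adj, hi = arrange_by_health_index([18.0, 13.5, 9.0, 4.5, 0.0])
```

Aligning every run on its failure point means that runs of different lengths can be compared directly by their time-to-failure.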

Polynomial  fitting 

Having rearranged the data according to the HI, each
run-to-failure event is then fitted with a polynomial function
which best describes the progress of the event. In the specific
case of this valve degradation example, the fault progression
is approximately linear. However, in other cases
the best fit may be a higher order polynomial or other
function. In this case the polynomial fit is:

$$f(x) = ax + b \tag{4}$$

where a and b are the model parameters. This polynomial
curve is fitted to the HI for every run-to-failure event with
the least squares fitting approach.
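For the linear case of Eq. (4), the least squares parameters have a closed form, sketched below in plain Python. The function name `fit_line` and the sample data are hypothetical; the returned RMS error is included because the distance evaluation uses it.

```python
import math


def fit_line(xs, ys):
    """Least squares fit of f(x) = a*x + b; returns (a, b, rms_error)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx                        # slope
    b = mean_y - a * mean_x              # intercept
    residuals = [(y - (a * x + b)) ** 2 for x, y in zip(xs, ys)]
    rms = math.sqrt(sum(residuals) / n)  # RMS error of the fit
    return a, b, rms


# Hypothetical HI values against adjusted time (failure at 0).
xs = [-4, -3, -2, -1, 0]
ys = [1.0, 0.75, 0.5, 0.25, 0.0]
a, b, rms = fit_line(xs, ys)
```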

2.2.2.  Distance  Evaluation 

To determine the RUL of the test runs, a sample of data
from near the start of each test is selected. In the examples
below, time steps 50-100 are chosen to represent the current
and recent historic condition of the valve. This data is then
compared against every 50-time-step segment of each
training data polynomial fit until the closest match to the
test is found. The distance is evaluated by:


$$d(\tau, Y, i) = \sum_{j=1}^{r} \frac{\left(y_j - f_i(-\tau - r + j)\right)^2}{\sigma^2} \tag{5}$$

where d is the distance of the test data from the training data
sample, y_j is the test data value at position j (time step
number), f_i is the polynomial curve fitted to the ith training
data sample, r is the length of the test data Y, τ is the number
of time steps Y is shifted from 0 and σ is the RMS error from
the polynomial fit.

Once  the  distance  between  the  test  run  and  all  windows  of 
all  training  runs  is  established,  the  estimated  RUL  is  chosen 
by  selecting  the  training  run  sample  with  the  smallest 
distance  d  (i.e.  the  most  similar  run-to-failure  event).  The 
RUL  from  that  point  of  the  training  run  is  the  estimated 
RUL  for  the  test  run. 
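The search over training runs and window positions can be sketched as follows. The sketch assumes each training run is summarized by its linear fit (a, b), the RMS error σ of that fit and its length in time steps, with the fitted curve expressed in adjusted time so that failure is at 0. It also takes the RUL to be the shift τ of the best match, since the last test sample then lies τ steps before failure. The names `distance` and `estimate_rul` and the two training models are hypothetical.

```python
def distance(y, a, b, sigma, tau):
    """Eq. (5) for a linear fit f(x) = a*x + b and shift tau."""
    r = len(y)
    total = 0.0
    for j in range(1, r + 1):
        fitted = a * (-tau - r + j) + b  # training curve at adjusted time
        total += (y[j - 1] - fitted) ** 2
    return total / sigma ** 2


def estimate_rul(y, models):
    """Return (estimated RUL, index of best training run)."""
    best = None
    for i, (a, b, sigma, life) in enumerate(models):
        # Slide the test window over every position in the training run.
        for tau in range(0, life - len(y) + 1):
            d = distance(y, a, b, sigma, tau)
            if best is None or d < best[0]:
                best = (d, i, tau)
    _, best_run, best_tau = best
    return best_tau, best_run


# Hypothetical training models: (a, b, sigma, life in time steps).
models = [(-0.01, 0.0, 0.001, 101), (-0.002, 0.0, 0.001, 501)]
# Hypothetical test window ending 300 steps before failure.
y = [-0.002 * t for t in range(-309, -299)]
rul, run = estimate_rul(y, models)
```

The exhaustive search grows with the number of training runs multiplied by their length, which is one reason a larger training library may call for a different distance evaluation method.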

3.  Experimental  Results 

The five test runs are summarized in Table 1 and shown in
Figures 5-9. The predicted RUL is within 9 hours of the true
RUL for four of the five test runs, and within 35 hours for
the remaining run.

Table 1. Summary of test run results with associated
estimated RUL and true RUL (in hours)

Test Run    Est RUL    True RUL
1           230        239
2           898        889
3           631        624
4           673        638
5           1204       1195

Figure 5. Test run 1: Estimated RUL = 230, True RUL = 239
(best training fit = 26)

Figure 6. Test run 2: Estimated RUL = 898, True RUL = 889
(best training fit = 27)

Figure 7. Test run 3: Estimated RUL = 631, True RUL = 624
(best training fit = 34)

Figure 8. Test run 4: Estimated RUL = 673, True RUL = 638
(best training fit = 3)

Figure 9. Test run 5: Estimated RUL = 1204, True RUL = 1195
(best training fit = 43)

These  results  are  considered  accurate  enough  for  the 
application  domain,  being  within  10  hours  of  the  actual 
RUL  in  most  cases,  and  35  hours  in  the  worst  case.  While 
this technique estimates the time to complete failure (zero
flow), in a power station, maintenance would be triggered by
a reduction in flow, significantly before failure. The
estimation  of  RUL  gives  an  indicative  window  of  time  in 
which  maintenance  could  or  should  be  performed,  thus 
providing  support  to  maintenance  planning.  Future  work 
will  consider  how  far  in  advance  of  estimated  failure  a 
maintenance  trigger  should  be  set,  bearing  in  mind 
uncertainties  in  the  RUL  prediction. 
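The error figures quoted above follow directly from Table 1, as the short calculation below shows. The variable names are illustrative, and the mean absolute error is an extra summary statistic computed here rather than one reported in the paper.

```python
# Estimated and true RUL (hours) for test runs 1-5, from Table 1.
est_rul = [230, 898, 631, 673, 1204]
true_rul = [239, 889, 624, 638, 1195]

abs_errors = [abs(e - t) for e, t in zip(est_rul, true_rul)]
mean_abs_error = sum(abs_errors) / len(abs_errors)
within_10 = sum(err <= 10 for err in abs_errors)

print(abs_errors)       # absolute error per test run
print(mean_abs_error)   # mean absolute error over the five runs
print(within_10)        # number of runs within 10 hours
```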

The high accuracy of the case study RUL predictions is due
to the range of failures included in the training data set,
which is due in turn to the use of simulation. With the high
fidelity plant simulator, plant conditions can be varied and
reset for multiple fault runs, generating as many failure
examples as desired.

There is potential for this similarity-based prognostic
method  to  be  improved  further,  with  a  larger  training  data 
set  containing  a  greater  breadth  of  degradation  and  failure 
cases.  Future  work  will  consider  how  large  the  training  set 
needs  to  be,  and  how  to  integrate  actual  valve  failure  data  as 
it  becomes  available. 

However,  as  more  training  data  is  added,  RUL  selection 
becomes  more  complex.  Future  extensions  of  this  technique 
may  need  to  consider  implementing  different  methods  of 
distance  evaluation,  to  retain  prediction  accuracy.  Also,  as 
this method relies on training using run-to-failure data, it is
limited  to  accurate  prediction  of  previously  seen  fault  types. 

4.  Conclusions 

The  similarity-based  prognostic  approach  described  in  this 
paper  provided  accurate  results  when  estimating  RUL  of 
valves  within  a  power  station.  This  research  utilizes  a  high 
fidelity  CCGT  plant  simulator  to  allow  the  creation  of  a 
large  suite  of  failure  cases,  simulating  a  relatively  low  risk 
but  high  consequence  failure  mode  for  which  there  is 
limited  in-service  data.  This  paper  demonstrates  a  method  of 
first  principles  modeling  of  failure,  in  order  to  generate  the 
data  required  for  data-driven  prognostic  modeling.  This  is 
shown  to  accurately  predict  the  remaining  life  of  five  test 
cases. 

Having  tested  the  method  there  are  a  number  of  possible 
routes  now  available  for  further  research  using  this 
approach:  testing  the  approach  with  real  plant  data,  applying 
the  prognostic  method  to  different  types  of  faults,  and 
comparing  this  technique  to  other  prognostic  techniques  for 
similar  applications. 